This summary table provides a high-level snapshot of system volume for April. With over 612,000 total flights and an average of roughly 19,700 flights per day.
Daily Flights vs 3-Day Rolling Average
This visualization shows the overall flight activity throughout April and compares the daily flight counts to a 3-day rolling average.
The number of flights through out april stay fairly stable around 20,000 flights per day. Daily flight counts fluctuate in a clear weekly pattern: weekday volumes are consistently higher, while weekends show noticeable dips, which explains the repeating troughs in the blue line.
While delays frustrate passengers, early departures can be equally problematic as passengers arriving at the scheduled time could miss their flights entirely.
Dallas/Fort Worth, TX: Dallas/Fort Worth International
B
Weather
888
Chicago, IL: Chicago O'Hare International
B
Weather
844
Denver, CO: Denver International
B
Weather
467
Charlotte, NC: Charlotte Douglas International
B
Weather
382
Minneapolis, MN: Minneapolis-St Paul International
B
Weather
294
Dallas, TX: Dallas Love Field
B
Weather
231
Chicago, IL: Chicago Midway International
B
Weather
218
Houston, TX: William P Hobby
A
Carrier
208
Houston, TX: George Bush Intercontinental/Houston
B
Weather
195
New York, NY: LaGuardia
B
Weather
187
Conclusion
1. Business Travel Dominates the Industry
The pronounced Monday peak (106,575 flights) and Saturday trough demonstrate that business travel, not leisure,. This weekday-centric pattern shows airlines optimize for corporate travelers returning from weekends and beginning weekly business trips.
While most flights operate smoothly, the presence of 43+ hour delays shows the huge gap between routine operations and worst-case scenarios highlighting the industry’s vulnerability.
3. Small Airports Face Disproportionate Challenges
Southwest Oregon Regional Airport’s 47-minute average delay . Smaller regional airports lack the infrastructure, redundancy, and resources to absorb disruptions, leading to persistent performance problems that affect every departure.
4. Weather Remains the Uncontrollable Wild Card
With weather being the dominant cancellation cause across the majority of airports, airlines face an operational reality that no amount of planning can fully overcome. However, carrier-related cancellations represent controllable failures where operational improvements could make a difference.
Recommendations
Operational Improvements
Target Problem Airports: Direct resources to airports with chronic delay issues (OTH, etc.)
Weather Preparedness: Enhance forecasting and contingency planning for April weather patterns
Schedule Buffer: Build in realistic turnaround times to prevent cascading delays
Early Departure Controls: Implement policies to prevent flights from leaving significantly ahead of schedule
Strategic Priorities
Business Traveler Focus: Given Monday’s dominance, optimize Monday operations and customer service
Regional Airport Support: Invest in infrastructure or operational support for struggling smaller airports
Cancellation Transparency: Provide clearer communication about cancellation causes and rebooking options
Data-Driven Resource Allocation: Use airport-specific and airline-specific delay patterns to guide staffing and equipment deployment
Data Source & Methodology
Data Source: U.S. Department of Transportation, Bureau of Transportation Statistics
Dataset: Airline On-Time Performance and Causes of Delay
Time Period: April 2019
Records Analyzed: 600,000+ individual flight records
Analysis Tools: SQL for data querying, R with ggplot2 for visualization
This section documents the SQL queries used to extract and analyze the flight data from the database. Each query corresponds to a specific analysis question and feeds into the visualizations presented above.
Query 1: Maximum Departure Delays by Airline
This query finds the worst departure delay for each airline and sorts results from smallest to largest.
SELECT l_ai_d.Name,MAX(a.DepDelayMinutes) AS MaxDepDelayFROM al_perf AS aJOIN L_AIRLINE_ID AS l_ai_dON a.DOT_ID_Reporting_Airline = l_ai_d.IDGROUPBY l_ai_d.NameORDERBY MaxDepDelay ASC;
Results: 17 airlines | Used in: Section 2 - Maximum Departure Delays
Query 2: Maximum Early Departures by Airline
This query identifies the earliest departures (in minutes) for each airline, sorted from largest to smallest early departure.
SELECT l_ai_d.Name,-MIN(a.DepDelay) AS EarlyDepFROM al_perf AS aJOIN L_AIRLINE_ID AS l_ai_dON a.DOT_ID_Reporting_Airline = l_ai_d.IDGROUPBY l_ai_d.NameORDERBY EarlyDep DESC;
Results: 17 airlines | Used in: Section 2 - Maximum Early Departures
Query 3: Flight Volume by Day of Week
This query ranks days of the week by total flight volume, with 1 being the busiest day.
SELECT l_w.DayAS DayOfWeek,COUNT(*) AS NumFlights,RANK() OVER (ORDERBYCOUNT(*) DESC) AS FlightRankFROM al_perf AS aJOIN L_WEEKDAYS AS l_wON a.DayOfWeek = l_w.CodeGROUPBY l_w.DayORDERBY FlightRank ASC;
Results: 7 days | Used in: Section 1 - Flight Volume by Day of Week
Query 4: Airport with Highest Average Departure Delay
This query identifies the single worst-performing airport based on average departure delay (treating early departures as 0 delay).
SELECT l_a.Name, l_a.Code AS Code,AVG(a.DepDelayMinutes) AS AverageDepDelayFROM al_perf AS aJOIN L_AIRPORT AS l_aON a.Origin = l_a.CodeGROUPBY l_a.Name, l_a.CodeORDERBY AverageDepDelay DESCLIMIT1;
This query uses CTEs to find each airline’s most problematic departure airport based on average delay.
WITH avg_delays AS (SELECT a.DOT_ID_Reporting_Airline AS airline_id, a.OriginAirportID AS airport_id,AVG(a.DepDelayMinutes) AS avg_delayFROM al_perf AS aGROUPBY airline_id, airport_id),max_delays AS (SELECT airline_id,MAX(avg_delay) AS max_avg_delayFROM avg_delaysGROUPBY airline_id)SELECT lairline.Name AS AirlineName, lairport.Name AS AirportName, md.max_avg_delay AS MaxAvgDepDelayFROM max_delays AS mdJOIN avg_delays AS adON md.airline_id = ad.airline_idAND md.max_avg_delay = ad.avg_delayJOIN L_AIRLINE_ID AS lairlineON ad.airline_id = lairline.IDJOIN L_AIRPORT_ID AS lairportON ad.airport_id = lairport.IDORDERBY MaxAvgDepDelay DESC;
Results: 17 airline-airport combinations | Used in: Section 3 - Worst Airport for Each Airline
Query 6a: Total Cancelled Flights
This query counts the total number of cancelled flights in the dataset.
SELECTCOUNT(*) AS NumCancelledFlightsFROM al_perfWHERE Cancelled =1;
Results: 14,488 cancelled flights | Used in: Section 4 - Total Cancellations
Query 6b: Most Frequent Cancellation Reason by Airport
This query identifies the primary cancellation reason at each departure airport.
WITH reason_counts AS (SELECT OriginAirportID AS airport_id, CancellationCode AS reason_code,COUNT(*) AS num_cancellationsFROM al_perfWHERE Cancelled =1GROUPBY airport_id, reason_code),max_counts AS (SELECT airport_id,MAX(num_cancellations) AS max_cancelFROM reason_countsGROUPBY airport_id)SELECT lairport.Name AS AirportName, rc.reason_code AS CancellationCode, lcancel.Reason, rc.num_cancellations AS NumCancellationsFROM reason_counts rcJOIN max_counts mcON rc.airport_id = mc.airport_idAND rc.num_cancellations = mc.max_cancelJOIN L_AIRPORT_ID lairportON rc.airport_id = lairport.IDJOIN L_CANCELATION lcancelON rc.reason_code = lcancel.codeORDERBY NumCancellations DESC;
Results: 317 airports | Used in: Section 4 - Primary Drivers of Flight Cancellations
Query 7: Daily Flights with 3-Day Rolling Average
This query calculates a 3-day rolling average of flight volumes using a window function.
WITH daily_flights AS (SELECT FlightDate,COUNT(*) AS NumFlightsFROM al_perfGROUPBY FlightDate)SELECT FlightDate, NumFlights,AVG(NumFlights) OVER (ORDERBY FlightDateROWSBETWEEN3PRECEDINGAND1PRECEDING ) AS AvgFlights_Preceding3DaysFROM daily_flightsORDERBY FlightDate;
Results: 31 days | Used in: Section 1 - Daily Flights vs 3-Day Rolling Average
Key SQL Techniques Used
Window Functions: RANK() OVER and AVG() OVER for rankings and rolling averages
Common Table Expressions (CTEs): Multi-step queries for complex aggregations (Queries 5 & 6b)
Multiple JOINs: Connecting fact tables with lookup tables for readable results
Aggregate Functions: MAX(), MIN(), AVG(), COUNT() for summarization
Conditional Filtering: WHERE Cancelled = 1 to isolate specific subsets
Full SQL Script: All queries are available in SQL_exports/Nkemdibe_Okweye_no252_Mini_Project_Problems.sql